02-1: First-order Spatial Point Pattern Analysis

Author

Yiqiong PAN

Published

September 2, 2025

Modified

September 13, 2025

1. Overview

Spatial Point Pattern Analysis (SPPA) refers to the study of how points are arranged or distributed across a given surface. There points may represent:

  • Events, such as crimes, road accidents, or disease occurrences, or

  • Service and facility locations, including shops (e.g., cafes, supermarkets) and community facilities like childcare or aged care centres.

First-order Spatial Point Pattern Analysis (1-st SPPA) examines the intensity or density of points across a study area. It identifies spatial trends in point distribution without considering interaction between points helping to answer questions like:

  • Where are points most concentrated?

  • Is density uniform or variable?

  • How dispersed is the pattern?

In this exercise, the spatsat is utilised to apply two common 1st-SPPA methods to explore:

  1. Whether childcare centres in Singapore are randomly distributed;

  2. If not, identifying areas with higher concentrations of centres.

2. The Data

The following datasets were downloaded from publicly available websites, and both are available in KML and GeoJSON format.

Dataset Name Source Discrption
Child Care Services data.gov.sg Point feature data: contains the locations and attributes of childcare centres.
Master plan 2019 Subzone Boundary (No Sea) singstat Polygon feature data: represents the URA 2019 Master Plan planning subzone boundaries.

3. Installing and Loading the R Packages

In addition to spatstat, a total of five R packages will be used in this exercise.

Package Discription
sf Simple Features, a new R package which handles importing, managing, and processing vector-based geospatial data.
spatstat Provides useful functions for SPPA, which will be called to conduct both 1st and 2nd SPPA and KDE.
terra Modern package for raster/vector spatial data and will be used to convert spastat outputs into terra format.
tmap Creates high quality static or interactive choropleth maps via leaflet.
rvest Scrapes and extracts data from web pages.

After installation, we load them into R environment using the code below.

pacman:: p_load(sf, terra, spatstat, tmap, rvest, tidyverse)

4. Importing and Wrangling Geospatial Data Sets

The following code chunk shows the steps to first import the Master plan 2019 Subzone Boundary (No Sea) data using st_read, extract the required 4 columns from the Description field, filter out the nearby islands, and finally save the file as mpsz_cl for further analysis.

mpsz_sf <- st_read("data/MasterPlan2019SubzoneBoundaryNoSeaKML.kml") %>%
  st_zm(drop = TRUE, what = "ZM") %>%
  st_transform(crs = 3414)
Reading layer `URA_MP19_SUBZONE_NO_SEA_PL' from data source 
  `C:\lsrgc\ISSS626-yiqiong-pan\Hands-on_Ex\Hands-on_Ex02\data\MasterPlan2019SubzoneBoundaryNoSeaKML.kml' 
  using driver `KML'
Simple feature collection with 332 features and 2 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 103.6057 ymin: 1.158699 xmax: 104.0885 ymax: 1.470775
Geodetic CRS:  WGS 84
extract_kml_field <- function(html_text, field_name) {
  if (is.na(html_text) || html_text == "") return(NA_character_)
  
  page <- read_html(html_text)
  rows <- page %>% html_elements("tr")
  
  value <- rows %>%
    keep(~ html_text2(html_element(.x, "th")) == field_name) %>%
    html_element("td") %>%
    html_text2()
  
  if (length(value) == 0) NA_character_ else value
}
# map_chr of purr (tidyverse) applies a function to each element of a list/vector and returns a character vector.
mpsz_sf <- mpsz_sf %>%
  mutate(
    REGION_N = map_chr(Description, extract_kml_field, "REGION_N"),
    PLN_AREA_N = map_chr(Description, extract_kml_field, "PLN_AREA_N"),
    SUBZONE_N = map_chr(Description, extract_kml_field, "SUBZONE_N"),
    SUBZONE_C = map_chr(Description, extract_kml_field, "SUBZONE_C")
  ) %>%
  select(-Name, -Description) %>%
  relocate(geometry, .after = last_col())
mpsz_cl <- mpsz_sf %>%
  filter(SUBZONE_N != "SOUTHERN GROUP",
         PLN_AREA_N != "WESTERN ISLANDS",
         PLN_AREA_N != "NORTH-EASTERN ISLANDS")
write_rds(mpsz_cl,
          "data/mpsz_cl.rds")

The code chuck below imports downloaded ChildCareServices data to R as sf data frame as childcare_sf by using st_read, coverts 3d to 2d (st_zm) and finally transform the CRS from WGS84 to SVY21.

childcare_sf <- st_read("data/ChildCareServices.geojson") %>%
  st_zm(drop = TRUE, what = "ZM") %>% # Drop Z and M to convert from multi-dimensional to 2d (XY)
  st_transform(crs = 3414)
Reading layer `ChildCareServices' from data source 
  `C:\lsrgc\ISSS626-yiqiong-pan\Hands-on_Ex\Hands-on_Ex02\data\ChildCareServices.geojson' 
  using driver `GeoJSON'
Simple feature collection with 1925 features and 2 fields
Geometry type: POINT
Dimension:     XYZ
Bounding box:  xmin: 103.6878 ymin: 1.247759 xmax: 103.9897 ymax: 1.462134
z_range:       zmin: 0 zmax: 0
Geodetic CRS:  WGS 84

4.1 Mapping the Geospatial Data Sets

Using the tmap mapping methods, the code chunk below creates a map combining childcare_sf and mpsz_cl.

tm_shape(mpsz_cl)+
  tm_fill(col = "grey" ) +
  tm_borders(col = "black") +
tm_shape(childcare_sf) +
  tm_dots(col = "black")

Alternatively, an interactive thematic map can be plotted using the code below. The interactive map is easy to navigate and query intuitively. It is optional to change the background map layer(choices: ESRI.WorldGrayCanvas(default), OpenStreetMap, ESRI.WorldTopoMap).

tmap_mode('view')
tm_shape(childcare_sf) +
  tm_dots() #creates a layer of dots to visualise point data on a map.
tmap_mode('plot') #switch back static maps
Warning

It is advised to always switch back to plot mode to save connection consumption and limit the number of interactive maps to 10 in one documents when publishing.

5. Geospatial Data Wrangling

In this section, the data (sf objects) will be converted to spatstat data structure: ppp (for point data) and owin for observation window.

5.1 Converting sf Data Frames to PPP

Here we use as.ppp() of spatstat package to covert the point data childcare_sf to ppp file, confirm the change using class() and have a quick overview of the data statistics via summary().

childcare_ppp <- as.ppp(childcare_sf)
class(childcare_ppp)
[1] "ppp"
summary(childcare_ppp)
Marked planar point pattern:  1925 points
Average intensity 2.417323e-06 points per square unit

Coordinates are given to 11 decimal places

Mark variables: Name, Description
Summary:
     Name           Description       
 Length:1925        Length:1925       
 Class :character   Class :character  
 Mode  :character   Mode  :character  

Window: rectangle = [11810.03, 45404.24] x [25596.33, 49300.88] units
                    (33590 x 23700 units)
Window area = 796335000 square units

Before moving forwards, let’s check if there are any duplicated points.

any(duplicated(childcare_ppp))
[1] FALSE

5.2 Creating Owin Object

Similarly, the owin object can be created using the function as.owin() for polygon data. After the conversion, the class() and plot() functions can be used to verify that the object is of the correct class and that the data retains its original shape.

sg_owin <- as.owin(mpsz_cl)
class(sg_owin)
[1] "owin"
plot(sg_owin)

5.3 Combining Point Events object and Owin Object

The code chunk below combines ppp and owin into one ppp file which means it updates the window of childcare_ppp to sg_owin and keeps the points that fall inside.

childcareSG_PPP = childcare_ppp[sg_owin]
class(childcareSG_PPP)
[1] "ppp"
childcareSG_PPP
Marked planar point pattern: 1925 points
Mark variables: Name, Description 
window: polygonal boundary
enclosing rectangle: [2667.54, 55941.94] x [21448.47, 50256.33] units

why does plotting not work here?

summary(childcareSG_PPP)
Marked planar point pattern:  1925 points
Average intensity 2.875208e-06 points per square unit

Coordinates are given to 11 decimal places

Mark variables: Name, Description
Summary:
     Name           Description       
 Length:1925        Length:1925       
 Class :character   Class :character  
 Mode  :character   Mode  :character  

Window: polygonal boundary
41 separate polygons (26 holes)
                  vertices         area relative.area
polygon 1              285  1.61128e+06      2.41e-03
polygon 2               27  1.50315e+04      2.25e-05
polygon 3 (hole)        41 -4.01660e+04     -6.00e-05
polygon 4 (hole)       317 -5.11280e+04     -7.64e-05
polygon 5 (hole)         3 -4.14100e-04     -6.19e-13
polygon 6               30  2.80002e+04      4.18e-05
polygon 7 (hole)         4 -2.86396e-01     -4.28e-10
polygon 8 (hole)         3 -1.81439e-04     -2.71e-13
polygon 9 (hole)         3 -5.99531e-04     -8.95e-13
polygon 10 (hole)        3 -3.04560e-04     -4.55e-13
polygon 11 (hole)        3 -4.46108e-04     -6.66e-13
polygon 12 (hole)        5 -2.44408e-04     -3.65e-13
polygon 13 (hole)        5 -3.64686e-02     -5.45e-11
polygon 14              71  8.18750e+03      1.22e-05
polygon 15 (hole)       38 -7.79904e+03     -1.16e-05
polygon 16              91  1.49663e+04      2.24e-05
polygon 17 (hole)      395 -7.38124e+03     -1.10e-05
polygon 18              40  1.38607e+04      2.07e-05
polygon 19 (hole)       11 -8.36705e+01     -1.25e-07
polygon 20 (hole)        3 -2.33435e-03     -3.49e-12
polygon 21              45  2.51218e+03      3.75e-06
polygon 22             139  3.22293e+03      4.81e-06
polygon 23             148  3.10395e+03      4.64e-06
polygon 24 (hole)        4 -1.72650e-04     -2.58e-13
polygon 25              75  1.73526e+04      2.59e-05
polygon 26              83  5.28920e+03      7.90e-06
polygon 27             106  3.04104e+03      4.54e-06
polygon 28              71  5.63061e+03      8.41e-06
polygon 29              10  1.99717e+02      2.98e-07
polygon 30 (hole)        3 -1.37223e-02     -2.05e-11
polygon 31 (hole)        3 -8.68789e-04     -1.30e-12
polygon 32 (hole)        3 -3.39815e-04     -5.08e-13
polygon 33 (hole)        3 -4.52041e-05     -6.75e-14
polygon 34 (hole)        3 -3.90173e-05     -5.83e-14
polygon 35 (hole)        3 -9.59845e-05     -1.43e-13
polygon 36 (hole)        8 -4.28707e-01     -6.40e-10
polygon 37 (hole)        4 -2.18619e-04     -3.27e-13
polygon 38 (hole)        6 -8.37554e-01     -1.25e-09
polygon 39 (hole)        5 -2.92235e-04     -4.36e-13
polygon 40           14053  6.67892e+08      9.98e-01
polygon 41 (hole)        3 -7.43616e-06     -1.11e-14
enclosing rectangle: [2667.54, 55941.94] x [21448.47, 50256.33] units
                     (53270 x 28810 units)
Window area = 669517000 square units
Fraction of frame area: 0.436
plot(childcareSG_PPP)

6. Clark-Evans Test for Nearest Neighbour Anaylysis (NNA)

Nearest Neighbor Analysis (NNA) is a general spatial analytical approach that examines the distribution of points by calculating the average nearest-neighbour distance to identify whether the pattern is clustered, random, or regular.

The Clark-Evans Test is a specific method within the NNA framework that uses the Clark-Evans aggregation index (R) to evaluate the spatial distribution of points.

The interpretation rule is:

𝑅< 1: clustered pattern

𝑅= 1: random pattern

𝑅> 1: regular pattern

In this study, the function clarkevans.test() from the spatstat package will be used. The hypotheses are defined as follows:

Null hypothesis (H0): Childcare services are randomly distributed.

A 95% confidence level will be applied for the test.

6.1 Performing the Clark-Evans Test without CSR (Monte Carlo)

clarkevans.test()has two modes without CSR and with CSR. Here we try out the mode without CSR. The alternative hypothesis (H1) here is the childcare services in Singapore are clustered.

Statistical Conclusion Since the Clark-Evans test shows an aggregation index R 0.535 which is less than 1 and p value is close to zero, we reject the null hypothesis of random pattern and conclude the childcare services in Singapore are clustered at 95% confidence level.

Business Communication The statistical results indicate that childcare centres in Singapore are unevenly distributed across the island. Given the critical role of childcare services in supporting social welfare and community stability, it is essential to zoom in and identify underserved areas and prioritise them in future planning.

clarkevans.test(childcareSG_PPP,
                correction = "none",
                clipregion = "sg_owin",
                alternative= c("clustered")
                )

    Clark-Evans test
    No edge correction
    Z-test

data:  childcareSG_PPP
R = 0.53532, p-value < 2.2e-16
alternative hypothesis: clustered (R < 1)

6.2 Performing the Clark-Evans Test with CSR (Monte Carlo?)

The code chunk below performs Clark-Evans Test with 99 times simulation.

Statistical Conclusion The Clark-Evans test shows an aggregation index R 0.535 and p value of 0.01. As a result we reject the null hypothesis of random pattern (CSR) and conclude the childcare services in Singapore are more clustered at 95% confidence level.

Business Communication The statistical results based on 99 Monte Carlo simulations confirm that childcare centres in Singapore are not evenly distributed. It is therefore essential to identify underserved areas and prioritise them in future planning. Policymakers may consider providing incentives to attract service providers to establish new centres in these regions or, in the interim, offering subsidies to local families to ensure a smoother transition and equitable access to childcare services.

clarkevans.test(childcareSG_PPP,
                correction ="none",
                clipregion ="sg_owin",
                alternative= "less",
                method = "MonteCarlo",
                nsim = 99)

    Clark-Evans test
    No edge correction
    Monte Carlo test based on 99 simulations of CSR with fixed n

data:  childcareSG_PPP
R = 0.53532, p-value = 0.01
alternative hypothesis: clustered (R < 1)

7. Kernel Density Estimation Method

Kernel Density Estimation (KDE) creates a continuous surface from point data, allowing us to visualise clusters, hotspots, and spatial variation.It provides an initial understanding of childcare service distribution in Singapore.

7.1 Working with Automatic Bandwidth Selection Method

The density() function is used to compute a KDE of the childcareSG_PPP point pattern and smooth these point locations into a continuous intensity surface. sigma is the bandwidth (smoothing parameter): a smaller sigma produces a spikier output, while a larger sigma smooths the surface. Bandwidth selectors include bw.diggle, bw.CvL, bw.scott, and bw.ppl. Enabling edge correction helps avoid border bias. The kernel function defines the shape of the smoothing: “gaussian” is the smoothest, while alternatives such as “epanechnikov”, “quartic”, or “disc” produce sharper surfaces.

kde_SG_diggle <- density (
  childcareSG_PPP,
  sigma = bw.diggle,
  edge = TRUE,
  kernel = "gaussian")

The density() function in spatstat returns an im object, a pixel image that stores raster data as a grid of values. Each pixel represents intensity at that location, and the result can be visualised using plot() and summary() for a quick overview.

plot(kde_SG_diggle)

summary(kde_SG_diggle)
real-valued pixel image
128 x 128 pixel array (ny, nx)
enclosing rectangle: [2667.538, 55941.94] x [21448.47, 50256.33] units
dimensions of each pixel: 416 x 225.0614 units
Image is defined on a subset of the rectangular grid
Subset area = 669941961.12249 square units
Subset area fraction = 0.437
Pixel values (inside window):
    range = [-6.584123e-21, 3.063698e-05]
    integral = 1927.788
    mean = 2.877545e-06

We can also check the bandwidth using bw.diggle().

bw <- bw.diggle(childcareSG_PPP)
bw
   sigma 
295.9712 

7.2 Rescalling KDE Values

As seen in the summary statistic of kde_SG_diggle, the KDE output ranges from 0 to 0.000037 which is hard to connect with real world interpretation. Since this is due to the unit in SVY21 is metre, here we convert the unit from metre to kilometre using rescale.ppp().

childcareSG_ppp_km <- rescale.ppp(
  childcareSG_PPP, 1000, "km")
summary(childcareSG_ppp_km)
Marked planar point pattern:  1925 points
Average intensity 2.875208 points per square km

Coordinates are given to 14 decimal places

Mark variables: Name, Description
Summary:
     Name           Description       
 Length:1925        Length:1925       
 Class :character   Class :character  
 Mode  :character   Mode  :character  

Window: polygonal boundary
41 separate polygons (26 holes)
                  vertices         area relative.area
polygon 1              285  1.61128e+00      2.41e-03
polygon 2               27  1.50315e-02      2.25e-05
polygon 3 (hole)        41 -4.01660e-02     -6.00e-05
polygon 4 (hole)       317 -5.11280e-02     -7.64e-05
polygon 5 (hole)         3 -4.14100e-10     -6.19e-13
polygon 6               30  2.80002e-02      4.18e-05
polygon 7 (hole)         4 -2.86396e-07     -4.28e-10
polygon 8 (hole)         3 -1.81440e-10     -2.71e-13
polygon 9 (hole)         3 -5.99531e-10     -8.95e-13
polygon 10 (hole)        3 -3.04560e-10     -4.55e-13
polygon 11 (hole)        3 -4.46108e-10     -6.66e-13
polygon 12 (hole)        5 -2.44408e-10     -3.65e-13
polygon 13 (hole)        5 -3.64686e-08     -5.45e-11
polygon 14              71  8.18750e-03      1.22e-05
polygon 15 (hole)       38 -7.79904e-03     -1.16e-05
polygon 16              91  1.49663e-02      2.24e-05
polygon 17 (hole)      395 -7.38124e-03     -1.10e-05
polygon 18              40  1.38607e-02      2.07e-05
polygon 19 (hole)       11 -8.36705e-05     -1.25e-07
polygon 20 (hole)        3 -2.33435e-09     -3.49e-12
polygon 21              45  2.51218e-03      3.75e-06
polygon 22             139  3.22293e-03      4.81e-06
polygon 23             148  3.10395e-03      4.64e-06
polygon 24 (hole)        4 -1.72650e-10     -2.58e-13
polygon 25              75  1.73526e-02      2.59e-05
polygon 26              83  5.28920e-03      7.90e-06
polygon 27             106  3.04104e-03      4.54e-06
polygon 28              71  5.63061e-03      8.41e-06
polygon 29              10  1.99717e-04      2.98e-07
polygon 30 (hole)        3 -1.37223e-08     -2.05e-11
polygon 31 (hole)        3 -8.68789e-10     -1.30e-12
polygon 32 (hole)        3 -3.39815e-10     -5.08e-13
polygon 33 (hole)        3 -4.52041e-11     -6.75e-14
polygon 34 (hole)        3 -3.90173e-11     -5.83e-14
polygon 35 (hole)        3 -9.59845e-11     -1.43e-13
polygon 36 (hole)        8 -4.28707e-07     -6.40e-10
polygon 37 (hole)        4 -2.18619e-10     -3.27e-13
polygon 38 (hole)        6 -8.37554e-07     -1.25e-09
polygon 39 (hole)        5 -2.92235e-10     -4.36e-13
polygon 40           14053  6.67892e+02      9.98e-01
polygon 41 (hole)        3 -7.43616e-12     -1.11e-14
enclosing rectangle: [2.66754, 55.94194] x [21.44847, 50.25633] km
                     (53.27 x 28.81 km)
Window area = 669.517 square km
Unit of length: 1 km
Fraction of frame area: 0.436

The code chunk below computes KDE in km, summrise and plot KDE. Now the legend confirms the change of unit measurement.The KDE results show that certain areas (in yellow) have an estimated intensity of more than 30 childcare centres per square kilometre.

kde_childcareSG_km <- density(childcareSG_ppp_km,
                              sigma = bw.diggle,
                              edge = TRUE,
                              kernel = "gaussian")
summary(kde_childcareSG_km)
real-valued pixel image
128 x 128 pixel array (ny, nx)
enclosing rectangle: [2.667538, 55.94194] x [21.44847, 50.25633] km
dimensions of each pixel: 0.416 x 0.2250614 km
Image is defined on a subset of the rectangular grid
Subset area = 669.941961122495 square km
Subset area fraction = 0.437
Pixel values (inside window):
    range = [-5.824417e-15, 30.63698]
    integral = 1927.788
    mean = 2.877545
plot(kde_childcareSG_km)

7.3 Working with Different Automatic Bandwidth Methods

As mentioned before, there are four automatic bandwidth methods. Let’s compare and see the difference for one ppp data.

bw.CvL(childcareSG_ppp_km)
   sigma 
4.357209 
bw.scott(childcareSG_ppp_km)
 sigma.x  sigma.y 
2.159749 1.396455 
bw.ppl(childcareSG_ppp_km)
   sigma 
0.378997 
bw.diggle(childcareSG_ppp_km)
    sigma 
0.2959712 

Baddeley et al. (2016) suggest using bw.ppl() for patterns with many tight clusters, while bw.diggle() is better for detecting a single tight cluster in random noise. The following code compares KDE results using both methods.The KDE results from bw.ppl on the right show that certain areas (in yellow) have an estimated intensity of more than 25 childcare centres per square kilometre. The PPL bandwidth (0.379) is close to Diggle’s (0.296) but avoids excessive spikiness, while being much smaller than CvL (4.357), which produces an overly smoothed surface.

kde_childcareSG.ppl <- density(childcareSG_ppp_km,
                               sigma = bw.ppl,
                               edge = TRUE,
                               kernel = "gaussian")

summary(kde_childcareSG.ppl)
real-valued pixel image
128 x 128 pixel array (ny, nx)
enclosing rectangle: [2.667538, 55.94194] x [21.44847, 50.25633] km
dimensions of each pixel: 0.416 x 0.2250614 km
Image is defined on a subset of the rectangular grid
Subset area = 669.941961122495 square km
Subset area fraction = 0.437
Pixel values (inside window):
    range = [-4.56929e-15, 25.04596]
    integral = 1929.932
    mean = 2.880745
par(mfrow = c(2,2))#shows two plots side by side (1 row, 2 columns).

plot(kde_childcareSG_km, main = "bw.diggle") #main is the title of the plot

plot(kde_childcareSG.ppl, main = "bw.ppl")

plot(density(childcareSG_ppp_km,
              sigma = bw.scott,
              edge = TRUE,
              kernel = "gaussian"),
     main = "bw.sscott")

plot(density(childcareSG_ppp_km,
              sigma = bw.CvL,
              edge = TRUE,
              kernel = "gaussian"),
     main = "bw.CvL")     

7.4 Working with Different kernel Methods

Next we compare difference between kernel methods (Gaussian, Epanechnikov, Quartic and Disc) through the charts.

par(mfrow = c(2,2)) # par= plot appearance settings, mfrow= multi-frame by row e.g. 2 * 2.
plot(density(childcareSG_ppp_km,
             sigma = 0.2959712,
             edge = TRUE,
             kernel = "gaussian"),
             main = "Gaussian")
plot(density(childcareSG_ppp_km,
             sigma = 0.2959712,
             edge = TRUE,
             kernel = "epanechnikov"),
             main = "Epanechnikov")
plot(density(childcareSG_ppp_km,
             sigma = 0.2959712,
             edge = TRUE,
             kernel = "quartic"),
             main = "Quartic")
plot(density(childcareSG_ppp_km,
             sigma = 0.2959712,
             edge = TRUE,
             kernel = "disc"),
             main = "Disc")

When sigma is same, the KDEs look alike between kernel methods, thus to plot a reasonable KDE map, bandwidth choice matters more.

8. Fixed and Adaptive KDE

8.1 Computing KDE by Using Fixed Bandwidth

In the section, we define the bandwidth 600 metres or 0.6 km for the KDE layer.

kde_childcare_fb <- density(childcareSG_ppp_km,
                            sigma = 0.6,
                            edge = TRUE,
                            kernel = "gaussian")
plot(kde_childcare_fb)

8.2 Computing KDE by Using Adaptive Bandwidth

density.adaptive() generates adaptive KDE and helps us to overcome problems such as highly skewed distribution e.g., rural vs urban.

kde_childcareSG_ab <- adaptive.density(
  childcareSG_ppp_km,
  method = "kernel")
plot(kde_childcareSG_ab)

We can compare both side by side.Adaptive KDE reduces visual contrast in hotspots because it balances smoothing between dense and sparse areas, whereas fixed bandwidth highlights dense clusters more strongly.

par(mfrow = c(1,2))
plot(kde_childcare_fb, main = "Fixed bandwidth")
plot(kde_childcareSG_ab, main = "Adaptive bandwidth")

9. Plotting Cartopraphic Quality KDE Map

9.1 Converting Gridded Output into Raster

The chunk below uses rast() from terra package to covert im kernel density object into SpatRaster object which is verified by the class().

kde_childcareSG_bw_terra <- rast(kde_childcareSG_km)
class(kde_childcareSG_bw_terra)
[1] "SpatRaster"
attr(,"package")
[1] "terra"
kde_childcareSG_bw_terra
class       : SpatRaster 
size        : 128, 128, 1  (nrow, ncol, nlyr)
resolution  : 0.4162063, 0.2250614  (x, y)
extent      : 2.667538, 55.94194, 21.44847, 50.25633  (xmin, xmax, ymin, ymax)
coord. ref. :  
source(s)   : memory
name        :         lyr.1 
min value   : -5.824417e-15 
max value   :  3.063698e+01 
unit        :            km 

Here the CSR property (coord.ref.) should be SVY21 but is empty as per properties displayed above.

9.2 Assigning Propjection Systems

The code chunk below assign the correct EPSG 3414 to the SpatRaster object.

crs(kde_childcareSG_bw_terra) <- "EPSG:3414"

kde_childcareSG_bw_terra
class       : SpatRaster 
size        : 128, 128, 1  (nrow, ncol, nlyr)
resolution  : 0.4162063, 0.2250614  (x, y)
extent      : 2.667538, 55.94194, 21.44847, 50.25633  (xmin, xmax, ymin, ymax)
coord. ref. : SVY21 / Singapore TM (EPSG:3414) 
source(s)   : memory
name        :         lyr.1 
min value   : -5.824417e-15 
max value   :  3.063698e+01 
unit        :            km 
Note

It seems that when working with spatial data, some values/attributes may be missing initially or may become missing during data conversion. Therefore, it is important to always do sanity check first.

Before running spatial analysis: simply typing the object name will print its metadata: dimensions, resolution, CRS, range of values, etc.

9.3 Plotting KDE Map with tmap

Let’s plot the KDE map.

tm_shape(kde_childcareSG_bw_terra) + # load the data source for the layers that follow
  tm_raster(col.scale =
              tm_scale_continuous(  # uses a continuous colour scale.
                values = "viridis"),
            col.legend = tm_legend(  # customises the legend for the raster:
              title = "Density values",
              title.size = 0.7,
              text.size = 0.7,
              bg.color = "white",
              bg.alpha = 0.7,
              position = tm_pos_in(
                "right", "bottom"),
              frame = TRUE)) +
  tm_graticules(labels.size = 0.7) + # adds latitude/longitude grid lines
  tm_compass() + # adds a compass
  tm_layout(scale = 1.0) # keeps default size

layer.1 is the default name for the single KDE band. It holds the density values, one for each pixel in the raster grid.?

10. First Order SPPA at the Planning Subzone Level

We focus on the childcare centres in the four areas: Punggol, Tampines, Choa Chu Kang and Jurong West.

10.1 Geospatial Data Wrangling

10.1.1 Extracting the Study Area

The code chunk uses filter() to create a new variable for each area and plot() for quick preview.

pg <- mpsz_cl %>%
  filter(PLN_AREA_N == "PUNGGOL")
tm <- mpsz_cl %>%
  filter(PLN_AREA_N == "TAMPINES")
ck <- mpsz_cl %>%
  filter(PLN_AREA_N == "CHOA CHU KANG")
jw <- mpsz_cl %>%
  filter(PLN_AREA_N == "JURONG WEST")
par(mfrow=c(2,2))
plot(st_geometry(pg), main = "Ponggol")
plot(st_geometry(tm), main = "Tampines")
plot(st_geometry(ck), main = "Choa Chu Kang")
plot(st_geometry(jw), main = "Jurong West")

10.1.2 Creating owin Object Using as.owin()

pg_owin = as.owin(pg)
tm_owin = as.owin(tm)
ck_owin = as.owin(ck)
jw_owin = as.owin(jw)

10.1.3 Combining Point Event Object and Owin Object

The code chunk below subsets the dataset to the study areas, rescales the unit of measurement from metre to kilometre and finally plot the areas with childcare points.

childcare_pg_ppp = childcare_ppp[pg_owin] #crop childcare points to Punggol
childcare_tm_ppp = childcare_ppp[tm_owin]
childcare_ck_ppp = childcare_ppp[ck_owin]
childcare_jw_ppp = childcare_ppp[jw_owin]
childcare_pg_ppp.km = rescale.ppp(childcare_pg_ppp, 1000, "km")
childcare_tm_ppp.km = rescale.ppp(childcare_tm_ppp, 1000, "km")
childcare_ck_ppp.km = rescale.ppp(childcare_ck_ppp, 1000, "km")
childcare_jw_ppp.km = rescale.ppp(childcare_jw_ppp, 1000, "km")
par(mfrow=c(2,2))
plot(unmark(childcare_pg_ppp.km),
  main = "Punggol")
plot(unmark(childcare_tm_ppp.km),
  main = "Tampines")
plot(unmark(childcare_ck_ppp.km),
  main = "Choa Chu Kang")
plot(unmark(childcare_jw_ppp.km),
  main = "Jurong West")

10.2 Clark-Evans Test

A 95% confidence level will be applied for the test.

  • Null Hypothesis: Childcare services are randomly distributed.

  • Alternative Hypothesis: Childcare services are mot randomly distributed.

clarkevans.test(childcare_ck_ppp,
                correction ="none",
                clipregion = NULL,
                alternative = c("two.sided"),
                nsim = 99)

    Clark-Evans test
    No edge correction
    Z-test

data:  childcare_ck_ppp
R = 0.84097, p-value = 0.008866
alternative hypothesis: two-sided
clarkevans.test(childcare_tm_ppp,
                correction ="none",
                clipregion = NULL,
                alternative = c("two.sided"),
                nsim = 99)

    Clark-Evans test
    No edge correction
    Z-test

data:  childcare_tm_ppp
R = 0.66817, p-value = 6.58e-12
alternative hypothesis: two-sided
clarkevans.test(childcare_ck_ppp,
                correction ="none",
                clipregion = NULL,
                alternative = c("two.sided"),
                nsim = 99)

    Clark-Evans test
    No edge correction
    Z-test

data:  childcare_ck_ppp
R = 0.84097, p-value = 0.008866
alternative hypothesis: two-sided
clarkevans.test(childcare_jw_ppp,
                correction ="none",
                clipregion = NULL,
                alternative = c("two.sided"),
                nsim = 99)

    Clark-Evans test
    No edge correction
    Z-test

data:  childcare_jw_ppp
R = 0.68968, p-value = 4.772e-10
alternative hypothesis: two-sided

Statistical Conclusion Based on the p-values for each study area, we reject all null hypotheses of complete spatial randomness. This indicates that childcare centres are significantly clustered rather than randomly distributed.

Business Communication While the analysis confirms that childcare centres are spatially clustered, this alone does not establish that the clusters coincide with residential hubs. Some residential complexes may lack nearby childcare facilities, while other areas may be overserved. To verify this relationship, further analysis with additional datasets (e.g., population density and housing distribution) is required.

10.3 Computing KDE Surfaces By Planning Area

par(mfrow=c(2,2))
plot(density(childcare_pg_ppp.km,
              sigma = bw.diggle,
              edge = TRUE,
              kernel = "gaussian"),
      main = "Punggol")

plot(density(childcare_tm_ppp.km,
              sigma = bw.diggle,
              edge = TRUE,
              kernel = "gaussian"),
      main = "Tempines")

plot(density(childcare_ck_ppp.km,
              sigma = bw.diggle,
              edge = TRUE,
              kernel = "gaussian"),
      main = "Choa Chu Kang")

plot(density(childcare_jw_ppp.km,
              sigma = bw.diggle,
              edge = TRUE,
              kernel = "gaussian"),
      main = "Jurong West")

par(mfrow=c(2,2))
plot(density(childcare_pg_ppp.km,
              sigma = bw.ppl,
              edge = TRUE,
              kernel = "gaussian"),
      main = "Punggol")

plot(density(childcare_tm_ppp.km,
              sigma = bw.ppl,
              edge = TRUE,
              kernel = "gaussian"),
      main = "Tempines")

plot(density(childcare_ck_ppp.km,
              sigma = bw.ppl,
              edge = TRUE,
              kernel = "gaussian"),
      main = "Choa Chu Kang")

plot(density(childcare_jw_ppp.km,
              sigma = bw.ppl,
              edge = TRUE,
              kernel = "gaussian"),
      main = "Jurong West")

par(mfrow=c(2,2))
plot(adaptive.density(childcare_pg_ppp.km,
              method = "kernel"),
      main = "Punggol")

plot(adaptive.density(childcare_tm_ppp.km,
              method = "kernel"),
      main = "Tempines")

plot(adaptive.density(childcare_ck_ppp.km,
              method = "kernel"),
      main = "Choa Chu Kang")

plot(adaptive.density(childcare_jw_ppp.km,
              method = "kernel"),
      main = "Jurong West")

KDE results show that childcare distribution patterns differ across planning areas, but none are evenly distributed. At this stage, we cannot confirm whether the clusters align with residential areas. Further analysis with population and housing data is needed to assess whether the current supply matches local demand to identify under-served areas.

##11 Reference Kam, T. S. 1st Order Spatial Point Patterns Analysis Methods. R for Geospatial Data Science and Analytics. https://r4gdsa.netlify.app/chap04